AITopics | directional derivative

Scaling Gaussian Processes with Derivative Information Using Variational Inference

Neural Information Processing Systems, Apr-25-2026, 09:57:07 GMT

Gaussian processes with derivative information are useful in many settings where derivative information is available, including numerous Bayesian optimization and regression tasks that arise in the natural sciences. Incorporating derivative observations, however, comes with a dominating $O(N^3D^3)$ computational cost when training on $N$ points in $D$ input dimensions. This is intractable for even moderately sized problems. While recent work has addressed this intractability in the low-$D$ setting, the high-$N$, high-$D$ setting is still unexplored and of great value, particularly as machine learning problems increasingly become high dimensional. In this paper, we introduce methods to achieve fully scalable Gaussian process regression with derivatives using variational inference. Analogous to the use of inducing values to sparsify the labels of a training set, we introduce the concept of inducing directional derivatives to sparsify the partial derivative information of a training set. This enables us to construct a variational posterior that incorporates derivative information but whose size depends neither on the full dataset size $N$ nor the full dimensionality $D$. We demonstrate the full scalability of our approach on a variety of tasks, ranging from a high dimensional stellarator fusion regression task to training graph convolutional neural networks on Pubmed using Bayesian optimization. Surprisingly, we find that our approach can improve regression performance even in settings where only label data is available.
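To make the scaling argument in this abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes the joint covariance over function values and full gradients has N(D+1) rows, and it assumes a hypothetical inducing layout in which each of M inducing points carries one value and p directional derivatives. The function names and the example sizes are illustrative only.

```python
def joint_size(n_points: int, n_dims: int) -> int:
    """Rows of the joint covariance over N values and N*D partial derivatives."""
    return n_points * (n_dims + 1)


def inducing_size(n_inducing: int, n_directions: int) -> int:
    """Rows when each of M inducing points carries one value and
    p directional derivatives (an assumed layout, for illustration)."""
    return n_inducing * (n_directions + 1)


def directional_derivative(gradient, direction):
    """Directional derivative of a differentiable function: the gradient
    projected onto the given direction, i.e. their inner product."""
    return sum(g * v for g, v in zip(gradient, direction))


if __name__ == "__main__":
    full = joint_size(n_points=1000, n_dims=50)
    sparse = inducing_size(n_inducing=200, n_directions=2)
    # A dense Cholesky factorization scales with the cube of the matrix size.
    print("full matrix rows:", full, "cubic cost:", full ** 3)
    print("inducing matrix rows:", sparse, "cubic cost:", sparse ** 3)
    print("d/dv f at gradient (3, 4), v = (1, 0):",
          directional_derivative([3.0, 4.0], [1.0, 0.0]))
```

Under these assumptions the inducing matrix size depends only on M and p, which is the sense in which the posterior size is independent of N and D.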

artificial intelligence, machine learning, optimization problem, (15 more...)

Neural Information Processing Systems

Country: North America > United States (1.00)

Genre: Research Report (0.68)

Industry: Education (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.55)

Asymptotic Theory for Graphical SLOPE: Precision Estimation and Pattern Convergence

Hejný, Ivan, Bonaccolto, Giovanni, Kremer, Philipp, Paterlini, Sandra, Bogdan, Małgorzata, Wallin, Jonas

arXiv.org Machine Learning, Apr-15-2026

This paper studies Graphical SLOPE for precision matrix estimation, with emphasis on its ability to recover both sparsity and clusters of edges with equal or similar strength. In a fixed-dimensional regime, we establish that the root-$n$ scaled estimation error converges to the unique minimizer of a strictly convex optimization problem defined through the directional derivative of the SLOPE penalty. We also establish convergence of the induced SLOPE pattern, thereby obtaining an asymptotic characterization of the clustering structure selected by the estimator. A comparison with GLASSO shows that the grouping property of SLOPE can substantially improve estimation accuracy when the precision matrix exhibits structured edge patterns. To assess the effect of departures from Gaussianity, we then analyze Gaussian-loss precision matrix estimation under elliptical distributions. In this setting, we derive the limiting distribution and quantify the inflation in variability induced by heavy tails relative to the Gaussian benchmark. We also study TSLOPE, based on the multivariate $t$-loss, and derive its limiting distribution. The results show that TSLOPE offers clear advantages over GSLOPE under heavy-tailed data-generating mechanisms. Simulation evidence suggests that these qualitative conclusions persist in high-dimensional settings, and an empirical application shows that SLOPE-based estimators, especially TSLOPE, can uncover economically meaningful clustered dependence structures.
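The role of the directional derivative of the SLOPE penalty can be illustrated on a plain coefficient vector. The sketch below assumes the standard sorted-L1 definition of the penalty (non-increasing weights paired with the magnitudes sorted in decreasing order); it is not the graphical estimator studied in the paper, and the function names and example values are hypothetical. Because the penalty is piecewise linear, a one-sided finite difference with a small step recovers the directional derivative up to rounding.

```python
def slope_penalty(b, lambdas):
    """Sorted-L1 (SLOPE) penalty: the largest weight multiplies the largest
    magnitude, the second largest weight the second largest, and so on."""
    magnitudes = sorted((abs(x) for x in b), reverse=True)
    return sum(lam * m for lam, m in zip(lambdas, magnitudes))


def slope_directional_derivative(b, u, lambdas, t=1e-6):
    """One-sided finite difference (J(b + t*u) - J(b)) / t for small t > 0."""
    shifted = [bi + t * ui for bi, ui in zip(b, u)]
    return (slope_penalty(shifted, lambdas) - slope_penalty(b, lambdas)) / t


if __name__ == "__main__":
    lambdas = [3.0, 2.0, 1.0]   # non-increasing weights
    b = [1.0, 1.0, 0.0]         # first two coordinates form a cluster
    u = [1.0, -1.0, 0.0]        # direction that splits the cluster
    minus_u = [-x for x in u]
    print(slope_directional_derivative(b, u, lambdas))
    print(slope_directional_derivative(b, minus_u, lambdas))
```

In this example both u and -u give a directional derivative of about 1, so the penalty increases whichever way the tied pair is split. This kink at points with equal magnitudes is what lets the penalty favor clustered coefficients.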

artificial intelligence, machine learning, matrix, (19 more...)

arXiv.org Machine Learning

2604.12771

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Sweden (0.04)
Europe > Poland > Lower Silesia Province > Wroclaw (0.04)
(2 more...)

Genre: Research Report > New Finding (0.66)

Industry: Banking & Finance > Trading (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.46)

Provably Correct Automatic Sub-Differentiation for Qualified Programs

Sham M. Kakade, Jason D. Lee

Neural Information Processing Systems, Mar-14-2026, 13:52:35 GMT

Neural Information Processing Systems http://nips.cc/

artificial intelligence, machine learning, nonsmooth function, (19 more...)

Neural Information Processing Systems

Country: North America > United States (0.93)

Genre: Research Report (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Provably Correct Automatic Sub-Differentiation for Qualified Programs

Sham M. Kakade, Jason D. Lee

Neural Information Processing Systems, Feb-12-2026, 07:15:52 GMT

Neural Information Processing Systems http://nips.cc/

differentiation, library, nonsmooth function, (16 more...)

Neural Information Processing Systems

Country:

North America > United States > California (0.14)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > Canada (0.04)
Europe > Netherlands > South Holland > Dordrecht (0.04)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Efficient Learning of Generative Models via Finite-Difference Score Matching

Neural Information Processing Systems, Feb-10-2026, 18:43:49 GMT

Several machine learning applications involve the optimization of higher-order derivatives (e.g., gradients of gradients) during training, which can be expensive with respect to memory and computation even with automatic differentiation.
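As a rough illustration of why finite differences can stand in for higher-order derivatives, the sketch below approximates first- and second-order directional derivatives using only function evaluations. It uses generic central-difference formulas and is not the estimator proposed in the paper; the test function f, the step size eps, and all names are assumptions chosen for the example.

```python
def first_directional(f, x, v, eps=1e-3):
    """Central difference for v . grad f(x): (f(x + eps*v) - f(x - eps*v)) / (2*eps)."""
    plus = [xi + eps * vi for xi, vi in zip(x, v)]
    minus = [xi - eps * vi for xi, vi in zip(x, v)]
    return (f(plus) - f(minus)) / (2.0 * eps)


def second_directional(f, x, v, eps=1e-3):
    """Central difference for v^T H v, where H is the Hessian of f at x:
    (f(x + eps*v) + f(x - eps*v) - 2*f(x)) / eps**2."""
    plus = [xi + eps * vi for xi, vi in zip(x, v)]
    minus = [xi - eps * vi for xi, vi in zip(x, v)]
    return (f(plus) + f(minus) - 2.0 * f(x)) / (eps ** 2)


def f(x):
    """Hypothetical test function: f(x0, x1) = x0^2 + 3*x0*x1."""
    return x[0] ** 2 + 3.0 * x[0] * x[1]


if __name__ == "__main__":
    x, v = [1.0, 2.0], [1.0, 0.0]
    print(first_directional(f, x, v))   # analytic value: 2*x0 + 3*x1 = 8
    print(second_directional(f, x, v))  # analytic value: d2f/dx0^2 = 2
```

Each estimate needs two or three evaluations of f and no nested differentiation. The trade-off is sensitivity to the step size: a smaller eps reduces truncation error but amplifies floating-point rounding.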

artificial intelligence, deep learning, machine learning, (18 more...)

Neural Information Processing Systems

Country: